Unsupervised learning of word sense disambiguation rules by estimating an optimum iteration number in the EM algorithm
Abstract
In this paper, we improve an unsupervised learning method using the Expectation-Maximization (EM) algorithm, proposed by Nigam et al. for text classification problems, in order to apply it to word sense disambiguation (WSD) problems. The improved method stops the EM algorithm at the optimum iteration number. To estimate that number, we propose two methods. In experiments, we solved 50 noun WSD problems in the Japanese Dictionary Task in SENSEVAL2. The score of our method matches the best public score for this task. Furthermore, our methods were confirmed to be effective for verb WSD problems as well.
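The idea of stopping EM at a chosen iteration can be sketched roughly as follows. The abstract does not describe the two estimation methods, so this sketch substitutes a simple stand-in: it scores each iteration on a small held-out labeled set and keeps the best one. The classifier is a multinomial naive Bayes model trained on labeled plus unlabeled examples, in the spirit of Nigam et al.; all function names, parameters, and the held-out criterion are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def train_nb(docs, weights, n_classes, vocab, alpha=1.0):
    """Fit multinomial naive Bayes from (possibly soft) class weights per document."""
    prior = [alpha] * n_classes
    counts = [Counter() for _ in range(n_classes)]
    totals = [0.0] * n_classes
    for doc, w in zip(docs, weights):
        for c in range(n_classes):
            prior[c] += w[c]
            totals[c] += w[c] * len(doc)
            for tok in doc:
                counts[c][tok] += w[c]
    z = sum(prior)
    log_prior = [math.log(p / z) for p in prior]
    log_like = [
        {t: math.log((counts[c][t] + alpha) / (totals[c] + alpha * len(vocab)))
         for t in vocab}
        for c in range(n_classes)
    ]
    return log_prior, log_like

def posterior(model, doc):
    """Class posterior for one document; tokens outside the vocabulary are skipped."""
    log_prior, log_like = model
    scores = [lp + sum(ll[t] for t in doc if t in ll)
              for lp, ll in zip(log_prior, log_like)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def accuracy(model, docs, labels):
    """Fraction of documents whose most probable class equals the gold label."""
    hits = 0
    for doc, y in zip(docs, labels):
        p = posterior(model, doc)
        if p.index(max(p)) == y:
            hits += 1
    return hits / len(docs)

def em_with_stopping(labeled, labels, unlabeled, held_docs, held_labels,
                     n_classes, max_iter=10):
    """Run EM and return (best held-out accuracy, iteration number, model).

    Assumption: the stopping point is picked by held-out accuracy, which is
    a stand-in for the paper's own (unspecified here) estimation methods.
    """
    vocab = {t for d in labeled + unlabeled for t in d}
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    model = train_nb(labeled, hard, n_classes, vocab)  # iteration 0: labeled data only
    best = (accuracy(model, held_docs, held_labels), 0, model)
    for it in range(1, max_iter + 1):
        soft = [posterior(model, d) for d in unlabeled]            # E-step
        model = train_nb(labeled + unlabeled, hard + soft,
                         n_classes, vocab)                         # M-step
        acc = accuracy(model, held_docs, held_labels)
        if acc > best[0]:  # strict comparison keeps the earliest best iteration
            best = (acc, it, model)
    return best
```

Here each document is a list of context tokens around the ambiguous word and each class is one sense. Keeping the earliest best iteration reflects the observation in the abstract that running EM to convergence is not necessarily optimal.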
Similar resources
Learning Probabilistic Models of Word Sense Disambiguation
This dissertation presents several new methods of supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised case, the Naive Bayesi...
Word Sense Disambiguation using Association Rules: A Review
Nowadays, Word Sense Disambiguation (WSD) is a vital and widely useful area. Many WSD algorithms are available in the literature; we have chosen an optimal and portable WSD algorithm. We discuss supervised, unsupervised, and knowledge-based approaches to WSD. In this paper we discuss association rules, knowledge-based WSD, and corpus-based WSD.
Using Word Embeddings for Bilingual Unsupervised WSD
Unsupervised Word Sense Disambiguation (WSD) is one of the challenging problems in natural language processing. Recently, an unsupervised bilingual WSD approach has been proposed. This approach uses a context-aware EM formulation for estimating the sense distribution by using the co-occurrence counts of cross-linked words in comparable corpora. WordNet-based similarity measures are used for approx...
Introducing a Machine-Based Approach Using the Lesk Algorithm and Syntactic Tagging for Word Sense Disambiguation
The present study introduces a machine-based approach for word sense disambiguation (WSD). In Persian, a morphologically complex language in which many homographs arise, one way of doing WSD is to assign the right part-of-speech (POS) tags to words prior to WSD. Since the frequency of noun and adjective homographs in different Persian POS-tagged text corpora is high, POS tag disambi...
Word-Sense Disambiguation of Sinhala Language with Unsupervised Learning
Resolving ambiguity requires little conscious effort in human communication. To make decisions about the intended sense of a word, we use our broad understanding of the language and real-world knowledge. Disambiguation in translation is the selection of the intended sense from a known finite set of possible meanings of an ambiguous word. This choice is based upon a probabilistic model tha...
Publication date: 2003